
    Ward's Hierarchical Clustering Method: Clustering Criterion and Agglomerative Algorithm

    The Ward error sum of squares hierarchical clustering method has been very widely used since its first description by Ward in a 1963 publication. It has also been generalized in various ways. However, the literature contains different interpretations of the method, and commonly used software systems implement the Ward agglomerative algorithm differently, including with differing expressions of the agglomerative criterion. Our survey work and case studies will be useful for all those involved in developing software for data analysis using Ward's hierarchical clustering method. Comment: 20 pages, 21 citations, 4 figures
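
    As a rough illustration of what an "agglomerative criterion" looks like, the sketch below uses one common textbook form of the Ward merge cost: the increase in the within-cluster error sum of squares when two clusters are merged. The abstract does not say which expression any given package uses, so the function names and the choice of this unsquared-root, unscaled form are assumptions made here for illustration only.

```python
# Minimal pure-Python sketch of Ward's criterion (illustrative, not any
# particular package's implementation). All names are hypothetical.

def centroid(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    dim = len(points[0])
    return [sum(p[d] for p in points) / n for d in range(dim)]

def sse(points):
    """Error sum of squares: squared distances of points to their centroid."""
    c = centroid(points)
    return sum(sum((p[d] - c[d]) ** 2 for d in range(len(c))) for p in points)

def ward_cost(a, b):
    """Increase in SSE caused by merging clusters a and b.

    Uses the identity |A||B| / (|A| + |B|) * ||c_A - c_B||^2.
    """
    ca, cb = centroid(a), centroid(b)
    dist2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * dist2

def ward_agglomerate(points):
    """Naive agglomeration: repeatedly merge the cheapest pair of clusters."""
    clusters = [[p] for p in points]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                cost = ward_cost(clusters[i], clusters[j])
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        cost, i, j = best
        merges.append((clusters[i], clusters[j], cost))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges

# Toy data, made up for this example.
points = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
merges = ward_agglomerate(points)
```

    Because every merge cost equals an increase in the error sum of squares, the costs recorded in `merges` add up to the error sum of squares of the whole data set. Implementations that report a square root or a rescaled version of this quantity would produce the same merge order here but different dendrogram heights.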

    A semi-parametric approach to estimate risk functions associated with multi-dimensional exposure profiles: application to smoking and lung cancer

    A common characteristic of environmental epidemiology is the multi-dimensional aspect of exposure patterns, frequently reduced to a cumulative exposure for simplicity of analysis. By adopting a flexible Bayesian clustering approach, we explore the risk function linking exposure history to disease. This approach is applied here to study the relationship between different smoking characteristics and lung cancer in the framework of a population-based case-control study.

    The Cost of Simplifying Air Travel When Modeling Disease Spread

    BACKGROUND: Air travel plays a key role in the spread of many pathogens. Modeling the long-distance spread of infectious disease in these cases requires an air travel model. Highly detailed air transportation models can be overdetermined and computationally problematic. We compared the predictions of a simplified air transport model with those of a model of all routes and assessed the impact of differences on models of infectious disease. METHODOLOGY/PRINCIPAL FINDINGS: Using U.S. ticket data from 2007, we compared a simplified "pipe" model, in which individuals flow in and out of the air transport system based on the number of arrivals and departures from a given airport, to a fully saturated model where all routes are modeled individually. We also compared the pipe model to a "gravity" model where the probability of travel is scaled by physical distance; the gravity model did not differ significantly from the pipe model. The pipe model roughly approximated actual air travel, but tended to overestimate the number of trips between small airports and underestimate travel between major east and west coast airports. For most routes, the maximum number of false (or missed) introductions of disease is small (<1 per day), but for a few routes this rate is greatly underestimated by the pipe model. CONCLUSIONS/SIGNIFICANCE: If our interest is in large-scale regional and national effects of disease, the simplified pipe model may be adequate. If we are interested in specific effects of interventions on particular air routes or the time for the disease to reach a particular location, a more complex point-to-point model will be more accurate. For many problems, a hybrid model that independently models some frequently traveled routes may be the best choice. Regardless of the model used, the effect of simplifications and sensitivity to errors in parameter estimation should be analyzed.
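
    The contrast between a "pipe" model and a fully route-level model can be sketched as follows. This is one plausible reading of the abstract, not the authors' implementation: it assumes that the pipe model sends each airport's departures to the other airports in proportion to their total arrivals. The airport names, the traffic counts, and the function name `pipe_model` are all hypothetical.

```python
# Illustrative sketch only: a proportional-allocation reading of the "pipe"
# model, compared against made-up route-level counts.

def pipe_model(departures, arrivals):
    """Estimate trips on each route from per-airport totals alone.

    Each origin's departures are split among all other airports in
    proportion to those airports' total arrivals (an assumption).
    """
    flows = {}
    for origin, dep in departures.items():
        other_arrivals = sum(n for a, n in arrivals.items() if a != origin)
        for dest, arr in arrivals.items():
            if dest != origin:
                flows[(origin, dest)] = dep * arr / other_arrivals
    return flows

# Hypothetical daily trips per route: two hubs (A, B) and one small airport (C).
actual = {
    ("A", "B"): 100, ("B", "A"): 100,
    ("A", "C"): 10,  ("C", "A"): 10,
    ("B", "C"): 0,   ("C", "B"): 0,
}

# Per-airport totals derived from the route-level data.
departures = {}
arrivals = {}
for (origin, dest), trips in actual.items():
    departures[origin] = departures.get(origin, 0) + trips
    arrivals[dest] = arrivals.get(dest, 0) + trips

estimated = pipe_model(departures, arrivals)

# Signed error per route: positive means the pipe model overestimates travel.
errors = {route: estimated[route] - actual[route] for route in actual}
```

    In this toy network the pipe estimate preserves each airport's total departures, yet it assigns traffic to the B to C route even though no trips occur there, and it reduces the estimate for the B to A route. Only a route-level model, or a hybrid that models selected routes individually, would capture such route-specific differences.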

    The genetic contribution of single male immigrants to small, inbred populations: A laboratory study using Drosophila melanogaster

    This study examined the genetic contribution of single male immigrants to small, inbred laboratory populations of Drosophila melanogaster. Genetic contribution was assessed by measuring the relative frequency of immigrant marker alleles in the first and second generations after immigration, while controlling for any selection effects at the marker locus, and for the experience of male immigrants. When immigrants were outbred, the mean frequency of the immigrant allele was significantly higher than its initial frequency, in both the first and second generations after immigration. There was no significant change in allele frequency for populations receiving inbred immigrants. The increase in allele frequency for outbred immigrants was attributed to an initial outbred vigour fitness advantage of immigrant males over resident males experiencing inbreeding depression. Hybrid vigour of immigrant progeny and the rare-male effect did not have a statistically significant role in the fitness advantage of the immigrant allele. The results suggest that inbreeding may have a considerable impact on the contribution of immigrants to the genetic diversity of populations.

    Identifying discrete behavioural types: A re-analysis of public goods game contributions by hierarchical clustering

    We propose a framework for identifying discrete behavioural types in experimental data. We re-analyse data from six previous studies of public goods voluntary contributions games. Using hierarchical clustering analysis, we construct a typology of behaviour based on a similarity measure between strategies. We identify four types with distinct stereotypical behaviours, which together account for about 90% of participants. Compared to previous approaches, our method produces a classification in which different types are more clearly distinguished in terms of strategic behaviour and the resulting economic implications.

    Multifrequency Strategies for the Identification of Gamma-Ray Sources

    More than half the sources in the Third EGRET (3EG) catalog have no firmly established counterparts at other wavelengths and are unidentified. Some of these unidentified sources have remained a mystery since the first surveys of the gamma-ray sky with the COS-B satellite. The unidentified sources generally have large error circles, and finding counterparts has often been a challenging job. A multiwavelength approach, using X-ray, optical, and radio data, is often needed to understand the nature of these sources. This chapter reviews the technique of identification of EGRET sources using multiwavelength studies of the gamma-ray fields. Comment: 35 pages, 22 figures. Chapter prepared for the book "Cosmic Gamma-ray Sources", edited by K.S. Cheng and G.E. Romero, to be published by Kluwer Academic Press, 2004. For complete article and higher resolution figures, go to: http://www.astro.columbia.edu/~muk/mukherjee_multiwave.pd

    Emergent global patterns of ecosystem structure and function from a mechanistic general ecosystem model

    Anthropogenic activities are causing widespread degradation of ecosystems worldwide, threatening the ecosystem services upon which all human life depends. Improved understanding of this degradation is urgently needed to improve avoidance and mitigation measures. One tool to assist these efforts is predictive models of ecosystem structure and function that are mechanistic: based on fundamental ecological principles. Here we present the first mechanistic General Ecosystem Model (GEM) of ecosystem structure and function that is both global and applies in all terrestrial and marine environments. Functional forms and parameter values were derived from the theoretical and empirical literature where possible. Simulations of the fate of all organisms with body masses between 10 µg and 150,000 kg (a range of 14 orders of magnitude) across the globe led to emergent properties at individual (e.g., growth rate), community (e.g., biomass turnover rates), ecosystem (e.g., trophic pyramids), and macroecological scales (e.g., global patterns of trophic structure) that are in general agreement with current data and theory. These properties emerged from our encoding of the biology of, and interactions among, individual organisms without any direct constraints on the properties themselves. Our results indicate that ecologists have gathered sufficient information to begin to build realistic, global, and mechanistic models of ecosystems, capable of predicting a diverse range of ecosystem properties and their response to human pressures.